Visual Capitalist published an animation created by James Eagle showing how smartphone vendor market shares developed over 30 years. In the center of the visualization is a donut chart displaying monthly market share values. The chart includes a legend, which repeats the share per manufacturer. Manufacturers displayed in the donut chart are highlighted in the legend.
The goal of this tutorial is to recreate the animated donut chart in R with {ggplot2}. We will use a different data source, because the data behind the original animation is not openly available, and (for now) leave out the legend.
Animation: How the Mobile Phone Market Has Evolved Over 30 Years 📲
— Visual Capitalist (@VisualCap) May 4, 2022
Article: https://t.co/7JfDGv4EYe
Courtesy of creator @JamesEagle17 pic.twitter.com/4wrxHE3LSQ
Let’s load the packages we will use for creating the animation, especially {ggplot2} via the Tidyverse and {gganimate}.
library(tidyverse)
library(gganimate)
library(ggtext)
library(lubridate)
Statcounter provides mobile vendor market shares back to 2010. The original animation goes back to the 1990s and uses data which are not openly available.
Statcounter explains how the data is collected in their FAQ section:
Statcounter is a web analytics service. Our tracking code is installed on more than 2 million sites globally. These sites cover various activities and geographic locations. Every month, we record billions of page views to these sites. For each page view, we analyse the browser/operating system/screen resolution used and we establish if the page view is from a mobile device.
Statcounter data is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License. It can be downloaded in CSV format via https://gs.statcounter.com/vendor-market-share/mobile/worldwide/#monthly-201003-202205. After the download, place the CSV file in your project directory before loading it into the R session. We are using the maximum period available, which is March 2010 to May 2022 (as of writing this tutorial).
filename <- "vendor-ww-monthly-201003-202205.csv"
df_raw <- read_csv(filename)
## Rows: 147 Columns: 70
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Date
## dbl (69): Samsung, Apple, Unknown, Nokia, Huawei, Xiaomi, LG, Oppo, Sony, Mo...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
n_vendors <- ncol(df_raw) - 1
head(df_raw)
## # A tibble: 6 × 70
## Date Samsung Apple Unknown Nokia Huawei Xiaomi LG Oppo Sony Motorola
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2010-04 2.37 33.2 0 37.6 0 0 0.23 0 8.3 0.3
## 2 2010-05 3.27 33.1 0 37.4 0 0 0.08 0 7.96 0.12
## 3 2010-06 3.86 30.7 0 38.3 0 0 0.08 0 7.9 0.11
## 4 2010-07 3.83 30.1 0 36.8 0 0 0.21 0 7.9 0.29
## 5 2010-08 4.07 29.8 0 36.4 0 0 0.22 0 7.81 0.35
## 6 2010-09 4.53 26.7 0 38.1 0 0 0.18 0 7.84 0.4
## # … with 59 more variables: HTC <dbl>, Lenovo <dbl>, RIM <dbl>, Micromax <dbl>,
## # Mobicel <dbl>, Asus <dbl>, `General Mobile` <dbl>, Google <dbl>, BBK <dbl>,
## # Vivo <dbl>, ZTE <dbl>, Alcatel <dbl>, Realme <dbl>, OnePlus <dbl>,
## # Tecno <dbl>, Infinix <dbl>, Lava <dbl>, Gionee <dbl>, Vodafone <dbl>,
## # Turkcell <dbl>, Wiko <dbl>, Coolpad <dbl>, Lyf <dbl>, Casper <dbl>,
## # Itel <dbl>, Hisense <dbl>, Vestel <dbl>, Spice <dbl>, AIS <dbl>,
## # Meizu <dbl>, bq <dbl>, QMobile <dbl>, LeEco <dbl>, Panasonic <dbl>, …
Each vendor’s market share is stored in its own column. In total, the dataframe contains 69 market share columns, including the catch-all categories “Unknown” and “Other”.
colnames(df_raw)
## [1] "Date" "Samsung" "Apple" "Unknown"
## [5] "Nokia" "Huawei" "Xiaomi" "LG"
## [9] "Oppo" "Sony" "Motorola" "HTC"
## [13] "Lenovo" "RIM" "Micromax" "Mobicel"
## [17] "Asus" "General Mobile" "Google" "BBK"
## [21] "Vivo" "ZTE" "Alcatel" "Realme"
## [25] "OnePlus" "Tecno" "Infinix" "Lava"
## [29] "Gionee" "Vodafone" "Turkcell" "Wiko"
## [33] "Coolpad" "Lyf" "Casper" "Itel"
## [37] "Hisense" "Vestel" "Spice" "AIS"
## [41] "Meizu" "bq" "QMobile" "LeEco"
## [45] "Panasonic" "True" "Acer" "Kyocera"
## [49] "Reliance Digital" "Pantech" "InFocus" "Intex"
## [53] "Xolo" "Nintendo" "Smartfren" "Archos"
## [57] "Blu" "HP" "i-Mobile" "Condor"
## [61] "Avea" "Karbonn" "dtac" "Yu"
## [65] "T-Mobile" "Symphony" "Sharp" "Lanix"
## [69] "Infinex" "Other"
For our plot, we have to transform the dataframe into long format, i.e. each vendor has to be encoded in a row instead of a column. Since displaying all vendors would lead to a cluttered chart, we lump vendors with smaller market shares into an “Other” category. There are a couple of months with a rather high share of “Unknown”, so we assign “Unknown” to “Other” as well. The lumping is done month by month: a vendor is recoded as “Other” in every month in which its market share is below the threshold (given in percent).
threshold_for_lumping <- 3.1
df_long <- df_raw %>%
pivot_longer(cols = -Date, names_to = "vendor", values_to = "market_share") %>%
# group vendors with smaller market shares to "Other" based on monthly shares
mutate(vendor2 = ifelse(market_share < threshold_for_lumping |
vendor == "Unknown", "Other", vendor),
date = ym(Date)) %>%
# the data from March 2010 is incomplete, remove it
filter(date > as_date("2010-03-01")) %>%
count(date, vendor2, wt = market_share, name = "market_share")
The first few rows of the transformed dataframe:
head(df_long)
## # A tibble: 6 × 3
## date vendor2 market_share
## <date> <chr> <dbl>
## 1 2010-04-01 Apple 33.2
## 2 2010-04-01 Nokia 37.6
## 3 2010-04-01 Other 5.11
## 4 2010-04-01 RIM 15.9
## 5 2010-04-01 Sony 8.3
## 6 2010-05-01 Apple 33.1
These are all vendors in the new grouped variable. Each of them appears in the chart in at least one month with a market share above the threshold:
unique(df_long$vendor2)
## [1] "Apple" "Nokia" "Other" "RIM" "Sony" "Samsung" "HTC"
## [8] "LG" "Huawei" "Lenovo" "Xiaomi" "Oppo" "Mobicel" "Vivo"
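To see what the month-by-month lumping does, here is a minimal sketch on made-up data. The vendor names, the shares and the `toy_` objects are hypothetical; only the pipeline steps mirror the code above.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two months, four columns, shares in percent
toy_raw <- tibble(
  Date    = c("2020-01", "2020-02"),
  VendorA = c(60, 55),
  VendorB = c(30, 2),
  VendorC = c(2, 40),
  Unknown = c(8, 3)
)

toy_threshold <- 3.1

toy_long <- toy_raw %>%
  pivot_longer(cols = -Date, names_to = "vendor", values_to = "market_share") %>%
  # recode small shares and "Unknown" as "Other", separately for each month
  mutate(vendor2 = ifelse(market_share < toy_threshold | vendor == "Unknown",
                          "Other", vendor)) %>%
  # sum the shares of everything that ended up in the same category
  count(Date, vendor2, wt = market_share, name = "market_share")

toy_long
```

In this toy data, VendorB is shown as its own category in the first month but falls into “Other” in the second, while the shares within each month still add up to 100.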
Let’s create the first basic graph, capturing the market share for each month of the first year:
df_long %>%
filter(date <= as_date("2011-03-01")) %>%
ggplot(aes(vendor2, market_share)) +
geom_col() +
coord_flip() +
facet_wrap(vars(date))
{ggpubr} is a great package for creating several chart types without all the details of {ggplot2}, including donut charts. Since we will be making some customizations for our animation, we will create the plot from scratch in {ggplot2}, though.
The base for the donut chart is a pie chart. Creating pie charts in {ggplot2} works just like creating a stacked bar chart in a polar coordinate system. We achieve this by adding coord_polar(theta = "y") to the plot. For this static chart we select the most recent month.
Instead of the default color palette, we use the Lapras palette from the {palettetown} package, which we access via {paletteer}.
df_long %>%
filter(date == as_date("2022-05-01")) %>%
ggplot(aes(x = 1, market_share, group = vendor2)) +
geom_col(aes(fill = vendor2), position = "fill") +
paletteer::scale_fill_paletteer_d("palettetown::lapras") +
coord_polar(theta = "y") +
theme_void() # removes all theme elements
Adding the labels for each category is a bit trickier. We have to calculate the label position from the cumulative sums.
p <- df_long %>%
filter(date == as_date("2022-05-01")) %>%
# calculate the label positions and format the label texts
arrange(date, market_share) %>%
mutate(label_pos = cumsum(market_share) / sum(market_share)
- 0.5 * market_share / sum(market_share),
label = sprintf("%s\n%s %%", vendor2, market_share),
vendor2 = fct_reorder(vendor2, -market_share)) %>%
ggplot(aes(x = 1, market_share, group = vendor2)) +
geom_col(aes(fill = vendor2), position = "fill") +
geom_label(aes(x = 1.5, label = label, y = label_pos)) +
paletteer::scale_fill_paletteer_d("palettetown::lapras") +
coord_polar(theta = "y") +
guides(fill = "none") +
theme_void()
p
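The label position arithmetic can be checked by hand on a small example. This is a sketch with made-up shares (the `share` vector is hypothetical), using the same formula as in the pipeline: the cumulative share marks the far edge of each segment, and subtracting half of the segment's own share moves the label to its middle.

```r
# Hypothetical shares for one month, sorted ascending as in the pipeline above
share <- c(Other = 10, VendorB = 30, VendorA = 60)

# Far edge of each segment as a fraction of the full circle
upper <- cumsum(share) / sum(share)

# Midpoint: step back by half of the segment's own width
label_pos <- upper - 0.5 * share / sum(share)
label_pos
```

The resulting positions are 0.05, 0.25 and 0.7, i.e. the midpoints of the intervals 0 to 0.1, 0.1 to 0.4 and 0.4 to 1. Because the bars are drawn with position = "fill", the y axis also runs from 0 to 1, so these fractions can be used directly as y coordinates.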
To turn the pie chart into a donut chart, we simply add a white circle (i.e. in the same color as the background) on top of it. Voilà, a donut chart. Adjust donut_hole_width to change the size of the inner ring. A value of 0 will result in a pie chart; a value of 1.5 or greater will cover the whole pie chart.
Inside the inner ring we display the current month using geom_text().
# adjust the size of the inner ring
donut_hole_width <- 0.75
p +
annotate("rect", xmin = 0, xmax = donut_hole_width, ymin = -Inf, ymax = Inf,
fill = "white") +
geom_text(aes(x = 0, y = 0, label = format(date, "%B\n%Y")), stat = "unique",
size = 8)
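Why does the hole cover everything from about 1.5 onwards? The following sketch works through the geometry. It assumes the bar sits at x = 1 with geom_col()'s default width of 0.9, so it spans 0.55 to 1.45 on the x axis, which coord_polar(theta = "y") maps to the radius. The helper `visible_ring()` is hypothetical and only serves as illustration.

```r
# Assumption: the bar sits at x = 1 with geom_col()'s default width of 0.9
bar_center <- 1
bar_width  <- 0.9
ring_inner <- bar_center - bar_width / 2
ring_outer <- bar_center + bar_width / 2

# Hypothetical helper: thickness of the ring (in x-axis units) that stays
# visible once a circle of radius `hole` is drawn on top of the bars
visible_ring <- function(hole) {
  max(0, ring_outer - max(hole, ring_inner))
}

visible_ring(0.75)  # the setting used above
visible_ring(1.5)   # the circle covers the bars completely
```

With donut_hole_width = 0.75, a ring of thickness 0.7 remains; at 1.5, nothing is left.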
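For the animation, we apply the same approach to all months instead of a single one. The label positions are now calculated within each month, the labels are placed with {ggrepel}, and the chart gets a dark theme.
p_donut <-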
df_long %>%
mutate(vendor2 = factor(vendor2, levels = unique(df_long$vendor2))) %>%
# now we need to calculate the label position within each month
group_by(date) %>%
arrange(desc(vendor2), .by_group = TRUE) %>%
mutate(
label_pos = cumsum(market_share) / sum(market_share)
- 0.5 * market_share / sum(market_share),
label = sprintf("%s\n%s %%", vendor2,
scales::number(market_share, accuracy = 0.1)),
label = fct_reorder(label, market_share)) %>%
ungroup() %>%
ggplot(aes(x = 1, market_share, group = vendor2)) +
geom_col(aes(fill = vendor2), position = "fill") +
ggrepel::geom_text_repel(aes(x = 1.5, label = label, y = label_pos),
hjust = 0, family = "Fira Sans", segment.size = 0.3,
min.segment.length = 0, nudge_x = 0.3, point.padding = 1e-05,
label.padding = 0.3, color = "white") +
# semi-transparent ring
annotate("rect", xmin = 0, xmax = donut_hole_width + 0.15, ymin = -Inf, ymax = Inf,
fill = alpha("grey4", 0.25)) +
# inner ring
annotate("rect", xmin = 0, xmax = donut_hole_width, ymin = -Inf, ymax = Inf,
fill = "grey4") +
geom_richtext(
aes(
x = 0, y = 0,
label = sprintf(
"<span style='color: grey80'>%s</span><br>
<span style='font-size: 40pt'>%s</span>",
format(date, "%B"), year(date))),
stat = "unique", size = 8, family = "Fira Sans SemiBold", color = "white",
fill = NA, label.size = 0, lineheight = 1.67) +
paletteer::scale_fill_paletteer_d("palettetown::lapras") +
coord_polar(theta = "y") +
guides(fill = "none", color = "none") +
labs(
title = "Mobile phone market 2010-2022",
subtitle = "Market share of mobile phone vendors"
) +
theme_void(base_family = "Fira Sans") +
theme(
plot.background = element_rect(color = NA, fill = "grey4"),
plot.margin = margin(10, 10, 10, 10),
text = element_text(color = "grey80"),
plot.title = element_text(
family = "Fira Sans SemiBold", color = "white", size = 16),
plot.title.position = "plot"
)
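## Warning: Ignoring unknown parameters: label.padding
The warning appears because label.padding is an argument of geom_label_repel(), not of geom_text_repel(). It is ignored here and can be dropped from the call.
The last step is the animation itself: transition_states(date) from {gganimate} treats each month as one state, animate() renders the frames, and anim_save() writes the result to a GIF file.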
# p_anim <- p_donut +
# transition_states(date)
#
# anim <- animate(p_anim, res = 100, width = 720, height = 640, fps = 12,
# duration = 60)
# anim_save("animated-donut-chart.gif", anim)
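A quick back-of-the-envelope check shows how many frames each month gets with these settings. This is a sketch that assumes animate() renders fps × duration frames when both arguments are given; the variable names are only for illustration.

```r
fps      <- 12
duration <- 60  # seconds
n_frames <- fps * duration

# One state per month, April 2010 to May 2022
months   <- seq(as.Date("2010-04-01"), as.Date("2022-05-01"), by = "month")
n_states <- length(months)

n_frames
n_states
n_frames / n_states
```

With 720 frames spread over 146 months, each month gets roughly five frames, including the transition to the next month. Increase duration or fps if the animation feels too fast.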